Overview

Dataset statistics

Number of variables: 13
Number of observations: 20490
Missing cells: 88
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.0 MiB
Average record size in memory: 104.0 B
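
The figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the 8-bytes-per-value storage model is an assumption (it is not stated in the report), chosen because it reproduces the 104.0 B record size.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 13
n_observations = 20490
missing_cells = 88

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: every column stores one 8-byte value per row
# (a 64-bit number, or an object pointer for text columns).
bytes_per_value = 8
record_bytes = n_variables * bytes_per_value
total_mib = record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"record size: {record_bytes} B")
print(f"total size: {total_mib:.1f} MiB")
```

Under that assumption the missing share is well below 0.1% and the total comes out at about 2.0 MiB, in line with the table.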

Variable types

Numeric: 10
Text: 3

Alerts

Unnamed: 0 is highly overall correlated with df_index (High correlation)
df_index is highly overall correlated with Unnamed: 0 (High correlation)
vividness is highly overall correlated with passive voice (High correlation)
passive voice is highly overall correlated with vividness (High correlation)
all adverbs is highly overall correlated with ly-adverbs and 1 other field (High correlation)
ly-adverbs is highly overall correlated with all adverbs (High correlation)
non-ly-adverbs is highly overall correlated with all adverbs (High correlation)
year is highly skewed (γ1 = -24.16478369) (Skewed)
Unnamed: 0 is uniformly distributed (Uniform)
Unnamed: 0 has unique values (Unique)
df_index has unique values (Unique)
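
One plausible reading of the adverb alerts is that all adverbs is roughly the sum of ly-adverbs and non-ly-adverbs (the reported means, 1.0162211 + 1.9273812, come to about 2.9436 against 2.943572). A total is usually strongly correlated with each of its parts. The sketch below illustrates this with hypothetical values; the additive relationship is an assumption, and the report's own correlation measure may differ from the Pearson coefficient used here.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-book percentages, not taken from the dataset.
ly_adverbs = [0.83, 0.99, 1.22, 0.51, 1.60]
non_ly_adverbs = [1.90, 1.43, 2.13, 1.87, 2.48]

# Assumption: "all adverbs" is the sum of the two components.
all_adverbs = [a + b for a, b in zip(ly_adverbs, non_ly_adverbs)]

r_ly = pearson(all_adverbs, ly_adverbs)
r_non_ly = pearson(all_adverbs, non_ly_adverbs)
print(f"all vs ly: {r_ly:.2f}, all vs non-ly: {r_non_ly:.2f}")
```

If the relationship does hold in the data, one of the three columns is redundant for modelling purposes.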

Reproduction

Analysis started: 2023-06-11 17:22:43.967476
Analysis finished: 2023-06-11 17:23:01.952007
Duration: 17.98 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ)

HIGH CORRELATION  UNIFORM  UNIQUE 

Distinct: 20490
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10244.5
Minimum: 0
Maximum: 20489
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1024.45
Q1: 5122.25
median: 10244.5
Q3: 15366.75
95-th percentile: 19464.55
Maximum: 20489
Range: 20489
Interquartile range (IQR): 10244.5

Descriptive statistics

Standard deviation: 5915.0978
Coefficient of variation (CV): 0.57739254
Kurtosis: -1.2
Mean: 10244.5
Median Absolute Deviation (MAD): 5122.5
Skewness: 0
Sum: 2.099098 × 10^8
Variance: 34988382
Monotonicity: Strictly increasing
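
These statistics are what a plain row counter would produce. As a rough check, the sketch below assumes the column holds the integers 0 to 20489 once each (an assumption suggested by the minimum, maximum and distinct count, not stated in the report) and recomputes the mean, sample variance, standard deviation and coefficient of variation.

```python
import statistics

# Assumption: the column is simply 0, 1, ..., n - 1.
n = 20490
values = range(n)

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = variance ** 0.5
cv = std / mean                          # coefficient of variation

# Closed forms for the integers 0..n-1 with the n - 1 denominator.
mean_closed_form = (n - 1) / 2
variance_closed_form = n * (n + 1) / 12

print(mean, variance, round(std, 4), round(cv, 5))
```

The results agree with the table to its displayed precision, which supports treating the column as a leftover index rather than a feature.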
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
0                       1       < 0.1%
14086                   1       < 0.1%
13664                   1       < 0.1%
13663                   1       < 0.1%
13662                   1       < 0.1%
13661                   1       < 0.1%
13660                   1       < 0.1%
13659                   1       < 0.1%
13658                   1       < 0.1%
13657                   1       < 0.1%
Other values (20480)    20480   > 99.9%

Smallest values

Value                   Count   Frequency (%)
0                       1       < 0.1%
1                       1       < 0.1%
2                       1       < 0.1%
3                       1       < 0.1%
4                       1       < 0.1%
5                       1       < 0.1%
6                       1       < 0.1%
7                       1       < 0.1%
8                       1       < 0.1%
9                       1       < 0.1%

Largest values

Value                   Count   Frequency (%)
20489                   1       < 0.1%
20488                   1       < 0.1%
20487                   1       < 0.1%
20486                   1       < 0.1%
20485                   1       < 0.1%
20484                   1       < 0.1%
20483                   1       < 0.1%
20482                   1       < 0.1%
20481                   1       < 0.1%
20480                   1       < 0.1%

df_index
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 20490
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10982.804
Minimum: 0
Maximum: 22428
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1066.45
Q1: 5455.25
median: 10978.5
Q3: 16483.75
95-th percentile: 20921.55
Maximum: 22428
Range: 22428
Interquartile range (IQR): 11028.5

Descriptive statistics

Standard deviation: 6367.7255
Coefficient of variation (CV): 0.57979051
Kurtosis: -1.2002347
Mean: 10982.804
Median Absolute Deviation (MAD): 5514.5
Skewness: 0.0040537727
Sum: 2.2503765 × 10^8
Variance: 40547928
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
0                       1       < 0.1%
15098                   1       < 0.1%
14650                   1       < 0.1%
14649                   1       < 0.1%
14648                   1       < 0.1%
14647                   1       < 0.1%
14645                   1       < 0.1%
14644                   1       < 0.1%
14643                   1       < 0.1%
14641                   1       < 0.1%
Other values (20480)    20480   > 99.9%

Smallest values

Value                   Count   Frequency (%)
0                       1       < 0.1%
1                       1       < 0.1%
2                       1       < 0.1%
3                       1       < 0.1%
4                       1       < 0.1%
5                       1       < 0.1%
6                       1       < 0.1%
7                       1       < 0.1%
8                       1       < 0.1%
9                       1       < 0.1%

Largest values

Value                   Count   Frequency (%)
22428                   1       < 0.1%
22417                   1       < 0.1%
22361                   1       < 0.1%
22256                   1       < 0.1%
22128                   1       < 0.1%
22127                   1       < 0.1%
22126                   1       < 0.1%
22125                   1       < 0.1%
22124                   1       < 0.1%
22122                   1       < 0.1%

title
Text

Distinct: 19552
Distinct (%): 95.4%
Missing: 0
Missing (%): 0.0%
Memory size: 160.2 KiB

Length

Max length: 90
Median length: 64
Mean length: 16.658175
Min length: 1

Characters and Unicode

Total characters: 341326
Distinct characters: 91
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
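
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample titles shown in this report. It is not the profiler's own code, only a minimal way to obtain the same kind of breakdown.

```python
import unicodedata
from collections import Counter

# The five sample titles listed in the report.
titles = [
    "The Vanished Birds",
    "The Price of Honor",
    "The Case of the Baker Street Irregulars",
    "Wildoak",
    "The Holiday",
]

# unicodedata.category returns a two-letter general category code,
# e.g. "Lu" (Uppercase Letter), "Ll" (Lowercase Letter), "Zs" (Space Separator).
category_counts = Counter(
    unicodedata.category(ch) for title in titles for ch in title
)

for code, count in category_counts.most_common():
    print(code, count)
```

On these five titles only three categories appear; the full column reports 13.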

Unique

Unique: 18834
Unique (%): 91.9%
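
The Distinct and Unique figures differ (19552 against 18834). A likely explanation, assumed here rather than confirmed by the report, is that Distinct counts the different values present while Unique counts the values that occur exactly once. The sketch below shows the difference on hypothetical titles.

```python
from collections import Counter

# Hypothetical titles; "The Holiday" appears twice on purpose.
titles = ["Wildoak", "The Holiday", "The Vanished Birds", "The Holiday"]

counts = Counter(titles)

# Distinct: how many different values are present.
distinct = len(counts)

# Unique (assumed meaning): how many values occur exactly once.
unique = sum(1 for c in counts.values() if c == 1)

print(distinct, unique)
```

Under that reading, 19552 - 18834 = 718 titles in the dataset would appear more than once.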

Sample

1st row: The Vanished Birds
2nd row: The Price of Honor
3rd row: The Case of the Baker Street Irregulars
4th row: Wildoak
5th row: The Holiday

Value                   Count   Frequency (%)
the                     8294    13.3%
of                      2927    4.7%
a                       1230    2.0%
in                      805     1.3%
and                     774     1.2%
to                      489     0.8%
you                     336     0.5%
for                     289     0.5%
girl                    286     0.5%
on                      278     0.4%
Other values (11080)    46817   74.9%

Most occurring characters

Value                   Count    Frequency (%)
(space)                 42044    12.3%
e                       37352    10.9%
o                       20703    6.1%
a                       19021    5.6%
r                       18797    5.5%
n                       17830    5.2%
t                       17432    5.1%
i                       17291    5.1%
h                       15092    4.4%
s                       14462    4.2%
Other values (81)       121302   35.5%

Most occurring categories

Value                   Count    Frequency (%)
Lowercase Letter        241848   70.9%
Uppercase Letter        54369    15.9%
Space Separator         42044    12.3%
Other Punctuation       1356     0.4%
Final Punctuation       1038     0.3%
Decimal Number          431      0.1%
Dash Punctuation        220      0.1%
Initial Punctuation     9        < 0.1%
Open Punctuation        4        < 0.1%
Close Punctuation       4        < 0.1%
Other values (3)        3        < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 37352
15.4%
o 20703
 
8.6%
a 19021
 
7.9%
r 18797
 
7.8%
n 17830
 
7.4%
t 17432
 
7.2%
i 17291
 
7.1%
h 15092
 
6.2%
s 14462
 
6.0%
l 11188
 
4.6%
Other values (22) 52680
21.8%
Uppercase Letter
ValueCountFrequency (%)
T 8839
16.3%
S 4962
 
9.1%
D 3292
 
6.1%
B 3243
 
6.0%
M 3227
 
5.9%
A 3086
 
5.7%
W 3004
 
5.5%
C 2933
 
5.4%
H 2542
 
4.7%
L 2505
 
4.6%
Other values (16) 16736
30.8%
Other Punctuation
ValueCountFrequency (%)
: 548
40.4%
, 432
31.9%
. 127
 
9.4%
' 115
 
8.5%
& 81
 
6.0%
! 22
 
1.6%
? 21
 
1.5%
/ 4
 
0.3%
# 3
 
0.2%
* 1
 
0.1%
Other values (2) 2
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 96
22.3%
2 68
15.8%
0 57
13.2%
9 44
10.2%
3 43
10.0%
4 38
 
8.8%
6 30
 
7.0%
8 26
 
6.0%
7 16
 
3.7%
5 13
 
3.0%
Open Punctuation
ValueCountFrequency (%)
( 3
75.0%
[ 1
 
25.0%
Close Punctuation
ValueCountFrequency (%)
) 3
75.0%
] 1
 
25.0%
Space Separator
ValueCountFrequency (%)
42044
100.0%
Final Punctuation
ValueCountFrequency (%)
1038
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 220
100.0%
Initial Punctuation
ValueCountFrequency (%)
9
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

Value                   Count    Frequency (%)
Latin                   296217   86.8%
Common                  45109    13.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 37352
 
12.6%
o 20703
 
7.0%
a 19021
 
6.4%
r 18797
 
6.3%
n 17830
 
6.0%
t 17432
 
5.9%
i 17291
 
5.8%
h 15092
 
5.1%
s 14462
 
4.9%
l 11188
 
3.8%
Other values (48) 107049
36.1%
Common
ValueCountFrequency (%)
42044
93.2%
1038
 
2.3%
: 548
 
1.2%
, 432
 
1.0%
- 220
 
0.5%
. 127
 
0.3%
' 115
 
0.3%
1 96
 
0.2%
& 81
 
0.2%
2 68
 
0.2%
Other values (23) 340
 
0.8%

Most occurring blocks

Value                   Count    Frequency (%)
ASCII                   340270   99.7%
Punctuation             1047     0.3%
None                    9        < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
42044
 
12.4%
e 37352
 
11.0%
o 20703
 
6.1%
a 19021
 
5.6%
r 18797
 
5.5%
n 17830
 
5.2%
t 17432
 
5.1%
i 17291
 
5.1%
h 15092
 
4.4%
s 14462
 
4.3%
Other values (72) 120246
35.3%
Punctuation
ValueCountFrequency (%)
1038
99.1%
9
 
0.9%
None
ValueCountFrequency (%)
é 2
22.2%
ô 2
22.2%
í 1
11.1%
° 1
11.1%
â 1
11.1%
ø 1
11.1%
ç 1
11.1%

author
Text

Distinct: 11194
Distinct (%): 54.6%
Missing: 0
Missing (%): 0.0%
Memory size: 160.2 KiB

Length

Max length: 85
Median length: 55
Mean length: 13.895656
Min length: 4

Characters and Unicode

Total characters: 284722
Distinct characters: 94
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7995
Unique (%): 39.0%

Sample

1st row: Simon Jimenez
2nd row: Jonathan P. Brazee
3rd row: Anthony Boucher
4th row: C. C. Harrington
5th row: T. M. Logan

Value                   Count   Frequency (%)
j                       653     1.4%
(not preserved)         628     1.3%
james                   488     1.0%
david                   416     0.9%
a                       411     0.9%
john                    367     0.8%
m                       355     0.8%
r                       309     0.7%
l                       286     0.6%
michael                 282     0.6%
Other values (9617)     42914   91.1%

Most occurring characters

Value                   Count    Frequency (%)
(space)                 26626    9.4%
e                       26421    9.3%
a                       25132    8.8%
n                       20256    7.1%
r                       18514    6.5%
i                       16003    5.6%
o                       14380    5.1%
l                       14048    4.9%
t                       10001    3.5%
s                       9816     3.4%
Other values (84)       103525   36.4%

Most occurring categories

Value                   Count    Frequency (%)
Lowercase Letter        205415   72.1%
Uppercase Letter        47179    16.6%
Space Separator         26626    9.4%
Other Punctuation       4991     1.8%
Dash Punctuation        170      0.1%
Nonspacing Mark         152      0.1%
Final Punctuation       120      < 0.1%
Open Punctuation        32       < 0.1%
Close Punctuation       32       < 0.1%
Initial Punctuation     3        < 0.1%
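
The 152 Nonspacing Mark characters suggest that some author names may be stored in decomposed form, with a base letter followed by a combining accent. If that is the case, composing them with Unicode normalization would make equal names compare equal. The sketch below uses a hypothetical name and Python's standard `unicodedata.normalize`; whether normalization is appropriate for this dataset is an assumption.

```python
import unicodedata

# Hypothetical author name written with combining acute accents (U+0301).
decomposed = "Jose\u0301 Garci\u0301a"

# NFC composes base letter + combining mark into a single code point
# wherever a precomposed form exists.
composed = unicodedata.normalize("NFC", decomposed)

# "Mn" is the general category code for Nonspacing Mark.
marks_before = sum(unicodedata.category(ch) == "Mn" for ch in decomposed)
marks_after = sum(unicodedata.category(ch) == "Mn" for ch in composed)

print(len(decomposed), len(composed), marks_before, marks_after)
```

After composition the two combining marks disappear and the string is two code points shorter.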

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 26421
12.9%
a 25132
12.2%
n 20256
9.9%
r 18514
9.0%
i 16003
 
7.8%
o 14380
 
7.0%
l 14048
 
6.8%
t 10001
 
4.9%
s 9816
 
4.8%
h 7901
 
3.8%
Other values (33) 42943
20.9%
Uppercase Letter
ValueCountFrequency (%)
M 4141
 
8.8%
S 4097
 
8.7%
J 3729
 
7.9%
C 3670
 
7.8%
A 3264
 
6.9%
R 2753
 
5.8%
D 2747
 
5.8%
B 2728
 
5.8%
K 2567
 
5.4%
L 2433
 
5.2%
Other values (19) 15050
31.9%
Nonspacing Mark
ValueCountFrequency (%)
́ 114
75.0%
̈ 17
 
11.2%
̃ 6
 
3.9%
̀ 5
 
3.3%
̧ 4
 
2.6%
̌ 3
 
2.0%
̇ 1
 
0.7%
̊ 1
 
0.7%
̂ 1
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 4274
85.6%
& 628
 
12.6%
, 68
 
1.4%
' 21
 
0.4%
Final Punctuation
ValueCountFrequency (%)
117
97.5%
3
 
2.5%
Decimal Number
ValueCountFrequency (%)
5 1
50.0%
0 1
50.0%
Space Separator
ValueCountFrequency (%)
26626
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 170
100.0%
Open Punctuation
ValueCountFrequency (%)
( 32
100.0%
Close Punctuation
ValueCountFrequency (%)
) 32
100.0%
Initial Punctuation
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

Value                   Count    Frequency (%)
Latin                   252594   88.7%
Common                  31976    11.2%
Inherited               152      0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 26421
 
10.5%
a 25132
 
9.9%
n 20256
 
8.0%
r 18514
 
7.3%
i 16003
 
6.3%
o 14380
 
5.7%
l 14048
 
5.6%
t 10001
 
4.0%
s 9816
 
3.9%
h 7901
 
3.1%
Other values (62) 90122
35.7%
Common
ValueCountFrequency (%)
26626
83.3%
. 4274
 
13.4%
& 628
 
2.0%
- 170
 
0.5%
117
 
0.4%
, 68
 
0.2%
( 32
 
0.1%
) 32
 
0.1%
' 21
 
0.1%
3
 
< 0.1%
Other values (3) 5
 
< 0.1%
Inherited
ValueCountFrequency (%)
́ 114
75.0%
̈ 17
 
11.2%
̃ 6
 
3.9%
̀ 5
 
3.3%
̧ 4
 
2.6%
̌ 3
 
2.0%
̇ 1
 
0.7%
̊ 1
 
0.7%
̂ 1
 
0.7%

Most occurring blocks

Value                   Count    Frequency (%)
ASCII                   284366   99.9%
Diacriticals            152      0.1%
Punctuation             123      < 0.1%
None                    81       < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
26626
 
9.4%
e 26421
 
9.3%
a 25132
 
8.8%
n 20256
 
7.1%
r 18514
 
6.5%
i 16003
 
5.6%
o 14380
 
5.1%
l 14048
 
4.9%
t 10001
 
3.5%
s 9816
 
3.5%
Other values (52) 103169
36.3%
Punctuation
ValueCountFrequency (%)
117
95.1%
3
 
2.4%
3
 
2.4%
Diacriticals
ValueCountFrequency (%)
́ 114
75.0%
̈ 17
 
11.2%
̃ 6
 
3.9%
̀ 5
 
3.3%
̧ 4
 
2.6%
̌ 3
 
2.0%
̇ 1
 
0.7%
̊ 1
 
0.7%
̂ 1
 
0.7%
None
ValueCountFrequency (%)
ö 31
38.3%
ø 9
 
11.1%
é 6
 
7.4%
ð 6
 
7.4%
ó 5
 
6.2%
á 4
 
4.9%
ï 3
 
3.7%
ü 3
 
3.7%
ł 2
 
2.5%
ñ 2
 
2.5%
Other values (10) 10
 
12.3%

total words
Real number (ℝ)

Distinct: 18537
Distinct (%): 90.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 92534.069
Minimum: 984
Maximum: 612956
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 984
5-th percentile: 31555.85
Q1: 70073
median: 88249
Q3: 107297.25
95-th percentile: 163614.55
Maximum: 612956
Range: 611972
Interquartile range (IQR): 37224.25

Descriptive statistics

Standard deviation: 43882.384
Coefficient of variation (CV): 0.47422948
Kurtosis: 13.426078
Mean: 92534.069
Median Absolute Deviation (MAD): 18546.5
Skewness: 2.2965898
Sum: 1.8960231 × 10^9
Variance: 1.9256636 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
78171                   4       < 0.1%
77033                   4       < 0.1%
70067                   4       < 0.1%
95740                   4       < 0.1%
67066                   4       < 0.1%
96888                   4       < 0.1%
84667                   4       < 0.1%
77028                   4       < 0.1%
71622                   4       < 0.1%
77111                   3       < 0.1%
Other values (18527)    20451   99.8%

Smallest values

Value                   Count   Frequency (%)
984                     1       < 0.1%
1503                    1       < 0.1%
1769                    1       < 0.1%
2021                    1       < 0.1%
2334                    1       < 0.1%
2398                    1       < 0.1%
2417                    1       < 0.1%
2524                    1       < 0.1%
2602                    1       < 0.1%
2638                    1       < 0.1%

Largest values

Value                   Count   Frequency (%)
612956                  1       < 0.1%
612868                  1       < 0.1%
593393                  1       < 0.1%
567072                  1       < 0.1%
535663                  1       < 0.1%
503164                  1       < 0.1%
502290                  1       < 0.1%
501470                  1       < 0.1%
486586                  1       < 0.1%
480194                  1       < 0.1%

vividness
Real number (ℝ)

Distinct: 5178
Distinct (%): 25.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.655792
Minimum: 1
Maximum: 99.75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 29.04
Q1: 39.64
median: 47.2
Q3: 55.14
95-th percentile: 68.3
Maximum: 99.75
Range: 98.75
Interquartile range (IQR): 15.5

Descriptive statistics

Standard deviation: 12.060809
Coefficient of variation (CV): 0.2530817
Kurtosis: 0.56222192
Mean: 47.655792
Median Absolute Deviation (MAD): 7.76
Skewness: 0.26502122
Sum: 976467.17
Variance: 145.46311
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
50.14                   20      0.1%
41                      17      0.1%
41.78                   16      0.1%
47.46                   16      0.1%
47.37                   16      0.1%
48.08                   16      0.1%
43.28                   15      0.1%
48.66                   15      0.1%
47.75                   15      0.1%
43.13                   15      0.1%
Other values (5168)     20329   99.2%

Smallest values

Value                   Count   Frequency (%)
1                       1       < 0.1%
1.01                    1       < 0.1%
1.02                    3       < 0.1%
1.03                    2       < 0.1%
1.05                    2       < 0.1%
1.06                    1       < 0.1%
1.07                    4       < 0.1%
1.08                    1       < 0.1%
1.11                    1       < 0.1%
1.12                    2       < 0.1%

Largest values

Value                   Count   Frequency (%)
99.75                   1       < 0.1%
99.71                   1       < 0.1%
98.81                   1       < 0.1%
98.31                   1       < 0.1%
98.28                   1       < 0.1%
97.14                   1       < 0.1%
96.79                   1       < 0.1%
96.56                   1       < 0.1%
96.18                   1       < 0.1%
96.17                   1       < 0.1%

passive voice
Real number (ℝ)

Distinct: 798
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.0364924
Minimum: 1.39
Maximum: 12.87
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 1.39
5-th percentile: 5.89
Q1: 7.23
median: 8.06
Q3: 8.88
95-th percentile: 10.08
Maximum: 12.87
Range: 11.48
Interquartile range (IQR): 1.65

Descriptive statistics

Standard deviation: 1.2654327
Coefficient of variation (CV): 0.15746082
Kurtosis: 0.28332088
Mean: 8.0364924
Median Absolute Deviation (MAD): 0.82
Skewness: -0.15781941
Sum: 164667.73
Variance: 1.6013198
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
8.2                     86      0.4%
8.17                    84      0.4%
7.92                    82      0.4%
7.77                    81      0.4%
7.76                    81      0.4%
7.85                    80      0.4%
7.38                    80      0.4%
7.99                    80      0.4%
7.83                    79      0.4%
8.59                    78      0.4%
Other values (788)      19679   96.0%

Smallest values

Value                   Count   Frequency (%)
1.39                    1       < 0.1%
2.58                    1       < 0.1%
2.66                    1       < 0.1%
2.7                     1       < 0.1%
2.72                    1       < 0.1%
2.75                    1       < 0.1%
2.84                    1       < 0.1%
3                       1       < 0.1%
3.02                    1       < 0.1%
3.08                    1       < 0.1%

Largest values

Value                   Count   Frequency (%)
12.87                   1       < 0.1%
12.7                    1       < 0.1%
12.68                   1       < 0.1%
12.42                   1       < 0.1%
12.35                   1       < 0.1%
12.31                   1       < 0.1%
12.3                    2       < 0.1%
12.26                   1       < 0.1%
12.25                   1       < 0.1%
12.24                   1       < 0.1%

all adverbs
Real number (ℝ)

Distinct: 371
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.943572
Minimum: 1.16
Maximum: 6.54
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 1.16
5-th percentile: 2.11
Q1: 2.58
median: 2.92
Q3: 3.28
95-th percentile: 3.88
Maximum: 6.54
Range: 5.38
Interquartile range (IQR): 0.7

Descriptive statistics

Standard deviation: 0.53744769
Coefficient of variation (CV): 0.18258351
Kurtosis: 0.47830799
Mean: 2.943572
Median Absolute Deviation (MAD): 0.35
Skewness: 0.34597734
Sum: 60313.79
Variance: 0.28885002
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
2.97                    192     0.9%
2.92                    185     0.9%
2.9                     184     0.9%
2.78                    166     0.8%
2.91                    166     0.8%
2.81                    164     0.8%
2.87                    164     0.8%
2.96                    163     0.8%
2.98                    162     0.8%
2.94                    162     0.8%
Other values (361)      18782   91.7%

Smallest values

Value                   Count   Frequency (%)
1.16                    1       < 0.1%
1.17                    1       < 0.1%
1.19                    1       < 0.1%
1.27                    1       < 0.1%
1.3                     2       < 0.1%
1.31                    1       < 0.1%
1.35                    2       < 0.1%
1.36                    1       < 0.1%
1.37                    1       < 0.1%
1.38                    2       < 0.1%

Largest values

Value                   Count   Frequency (%)
6.54                    1       < 0.1%
5.72                    2       < 0.1%
5.39                    1       < 0.1%
5.35                    1       < 0.1%
5.25                    1       < 0.1%
5.21                    1       < 0.1%
5.19                    1       < 0.1%
5.13                    1       < 0.1%
5.12                    1       < 0.1%
5.1                     1       < 0.1%

ly-adverbs
Real number (ℝ)

Distinct: 251
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0162211
Minimum: 0
Maximum: 4.39
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.51
Q1: 0.78
median: 0.99
Q3: 1.22
95-th percentile: 1.6055
Maximum: 4.39
Range: 4.39
Interquartile range (IQR): 0.44

Descriptive statistics

Standard deviation: 0.33933842
Coefficient of variation (CV): 0.33392185
Kurtosis: 1.3102762
Mean: 1.0162211
Median Absolute Deviation (MAD): 0.22
Skewness: 0.59184623
Sum: 20822.37
Variance: 0.11515056
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
0.83                    284     1.4%
0.99                    282     1.4%
0.96                    272     1.3%
0.97                    267     1.3%
1.02                    265     1.3%
0.87                    263     1.3%
0.85                    261     1.3%
1.04                    260     1.3%
1                       257     1.3%
0.95                    256     1.2%
Other values (241)      17823   87.0%

Smallest values

Value                   Count   Frequency (%)
0                       1       < 0.1%
0.02                    1       < 0.1%
0.04                    1       < 0.1%
0.07                    2       < 0.1%
0.08                    1       < 0.1%
0.09                    3       < 0.1%
0.1                     1       < 0.1%
0.11                    1       < 0.1%
0.12                    2       < 0.1%
0.13                    1       < 0.1%

Largest values

Value                   Count   Frequency (%)
4.39                    1       < 0.1%
3.18                    2       < 0.1%
2.68                    1       < 0.1%
2.65                    1       < 0.1%
2.64                    1       < 0.1%
2.6                     1       < 0.1%
2.58                    2       < 0.1%
2.53                    1       < 0.1%
2.52                    1       < 0.1%
2.51                    1       < 0.1%

non-ly-adverbs
Real number (ℝ)

Distinct: 249
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9273812
Minimum: 0.71
Maximum: 3.87
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 0.71
5-th percentile: 1.43
Q1: 1.71
median: 1.91
Q3: 2.13
95-th percentile: 2.48
Maximum: 3.87
Range: 3.16
Interquartile range (IQR): 0.42

Descriptive statistics

Standard deviation: 0.32614114
Coefficient of variation (CV): 0.16921466
Kurtosis: 0.7567445
Mean: 1.9273812
Median Absolute Deviation (MAD): 0.21
Skewness: 0.39009848
Sum: 39492.04
Variance: 0.10636804
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
1.87                    310     1.5%
1.9                     307     1.5%
1.93                    273     1.3%
1.94                    272     1.3%
1.86                    269     1.3%
1.82                    266     1.3%
1.92                    266     1.3%
1.89                    264     1.3%
1.76                    263     1.3%
1.79                    263     1.3%
Other values (239)      17737   86.6%

Smallest values

Value                   Count   Frequency (%)
0.71                    1       < 0.1%
0.85                    1       < 0.1%
0.87                    1       < 0.1%
0.9                     1       < 0.1%
0.93                    3       < 0.1%
0.94                    4       < 0.1%
0.96                    2       < 0.1%
0.97                    2       < 0.1%
0.98                    5       < 0.1%
0.99                    1       < 0.1%

Largest values

Value                   Count   Frequency (%)
3.87                    1       < 0.1%
3.68                    1       < 0.1%
3.53                    1       < 0.1%
3.51                    1       < 0.1%
3.48                    1       < 0.1%
3.47                    1       < 0.1%
3.44                    1       < 0.1%
3.42                    1       < 0.1%
3.41                    1       < 0.1%
3.4                     1       < 0.1%

year
Real number (ℝ)

Distinct: 216
Distinct (%): 1.1%
Missing: 88
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.2646
Minimum: 180
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 180
5-th percentile: 1977
Q1: 2013
median: 2018
Q3: 2020
95-th percentile: 2022
Maximum: 2023
Range: 1843
Interquartile range (IQR): 7
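
The quartiles above make it easy to see how far the smallest years sit from the bulk of the data. The sketch below applies Tukey's 1.5 × IQR rule, which is a common heuristic and not something the report itself uses, to the ten smallest year values listed in this report.

```python
# Quartiles copied from the "Quantile statistics" table for year.
q1, q3 = 2013, 2020
iqr = q3 - q1

# Tukey's rule: values more than 1.5 * IQR beyond the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten smallest year values listed in the report.
smallest = [180, 311, 351, 426, 701, 801, 1320, 1579, 1588, 1590]
flagged = [y for y in smallest if y < lower_fence]

print(lower_fence, upper_fence, len(flagged))
```

All ten values fall below the lower fence. Since the 5-th percentile (1977) is below it too, the rule flags more than 5% of the rows, so it is better read as a prompt to inspect the early years than as proof that they are errors.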

Descriptive statistics

Standard deviation: 38.175697
Coefficient of variation (CV): 0.018990384
Kurtosis: 920.61849
Mean: 2010.2646
Median Absolute Deviation (MAD): 2
Skewness: -24.164784
Sum: 41013419
Variance: 1457.3839
Monotonicity: Not monotonic
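
A skewness this large and negative is what a handful of very small values in an otherwise tight distribution tends to produce. The sketch below shows the effect on hypothetical data using the plain moment-based estimator; the profiler may apply a small-sample correction, so this is an illustration of the mechanism and not a reproduction of the reported value.

```python
def sample_skewness(xs):
    """Moment-based skewness m3 / m2**1.5, without small-sample correction."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Hypothetical data: 200 values spread evenly over 2013..2022,
# then the same values plus a single early year (180, the reported minimum).
recent = [2013 + (i % 10) for i in range(200)]
with_outlier = recent + [180]

skew_recent = sample_skewness(recent)
skew_outlier = sample_skewness(with_outlier)

print(skew_recent, skew_outlier)
```

The symmetric sample has zero skewness, while one extreme value is enough to push the statistic strongly negative.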
Histogram with fixed size bins (bins=50)
Common values

Value                   Count   Frequency (%)
2020                    3049    14.9%
2019                    2757    13.5%
2018                    2492    12.2%
2021                    2070    10.1%
2017                    1504    7.3%
2022                    1228    6.0%
2016                    1118    5.5%
2015                    382     1.9%
2014                    326     1.6%
2013                    295     1.4%
Other values (206)      5181    25.3%

Smallest values

Value                   Count   Frequency (%)
180                     1       < 0.1%
311                     1       < 0.1%
351                     1       < 0.1%
426                     1       < 0.1%
701                     1       < 0.1%
801                     1       < 0.1%
1320                    1       < 0.1%
1579                    1       < 0.1%
1588                    1       < 0.1%
1590                    1       < 0.1%

Largest values

Value                   Count   Frequency (%)
2023                    158     0.8%
2022                    1228    6.0%
2021                    2070    10.1%
2020                    3049    14.9%
2019                    2757    13.5%
2018                    2492    12.2%
2017                    1504    7.3%
2016                    1118    5.5%
2015                    382     1.9%
2014                    326     1.6%

genres
Text

Distinct: 3160
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 160.2 KiB

Length

Max length: 97
Median length: 84
Mean length: 36.319131
Min length: 9

Characters and Unicode

Total characters: 744179
Distinct characters: 35
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1937
Unique (%): 9.5%

Sample

1st row: ['Science Fiction', 'Fantasy', 'Adult']
2nd row: ['Science Fiction']
3rd row: ['Mystery', 'Crime', 'Classics']
4th row: ['Historical Fiction', 'Young Adult']
5th row: ['Thriller', 'Mystery', 'Crime', 'Suspense']
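
The sample rows look like Python list literals stored as text, which is why the profiler treats genres as a Text variable. A minimal sketch of turning them back into lists is shown below, using the five sample rows from this report; it assumes every cell follows the same literal format, which the report does not guarantee.

```python
import ast
from collections import Counter

# Sample values copied from the report; each cell is a string that
# looks like a Python list literal.
rows = [
    "['Science Fiction', 'Fantasy', 'Adult']",
    "['Science Fiction']",
    "['Mystery', 'Crime', 'Classics']",
    "['Historical Fiction', 'Young Adult']",
    "['Thriller', 'Mystery', 'Crime', 'Suspense']",
]

# ast.literal_eval only parses literals, so it does not execute code.
parsed = [ast.literal_eval(row) for row in rows]

genres_per_row = [len(genres) for genres in parsed]
genre_counts = Counter(genre for genres in parsed for genre in genres)

print(genres_per_row)
print(genre_counts.most_common(3))
```

Counting whole genre labels this way keeps multi-word genres such as "Science Fiction" together, unlike the word-level frequency table in this section.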

Value                   Count   Frequency (%)
mystery                 8000    11.7%
fiction                 7947    11.7%
thriller                6659    9.8%
fantasy                 5970    8.8%
adult                   5222    7.7%
historical              4914    7.2%
science                 3830    5.6%
crime                   3765    5.5%
contemporary            3328    4.9%
romance                 3234    4.7%
Other values (10)       15250   22.4%

Most occurring characters

Value                   Count    Frequency (%)
'                       115258   15.5%
r                       51056    6.9%
(space)                 47629    6.4%
i                       45507    6.1%
e                       43390    5.8%
t                       39136    5.3%
,                       37139    5.0%
o                       33058    4.4%
n                       32433    4.4%
a                       30654    4.1%
Other values (25)       268919   36.1%

Most occurring categories

Value                   Count    Frequency (%)
Lowercase Letter        435054   58.5%
Other Punctuation       152397   20.5%
Uppercase Letter        68119    9.2%
Space Separator         47629    6.4%
Close Punctuation       20490    2.8%
Open Punctuation        20490    2.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 51056
11.7%
i 45507
10.5%
e 43390
10.0%
t 39136
9.0%
o 33058
7.6%
n 32433
7.5%
a 30654
7.0%
s 29144
 
6.7%
y 28787
 
6.6%
l 25813
 
5.9%
Other values (8) 76076
17.5%
Uppercase Letter
ValueCountFrequency (%)
F 13917
20.4%
M 8984
13.2%
C 8151
12.0%
H 7721
11.3%
S 6767
9.9%
T 6659
9.8%
A 6565
9.6%
R 3234
 
4.7%
Y 2543
 
3.7%
P 1301
 
1.9%
Other values (2) 2277
 
3.3%
Other Punctuation
ValueCountFrequency (%)
' 115258
75.6%
, 37139
 
24.4%
Space Separator
ValueCountFrequency (%)
47629
100.0%
Close Punctuation
ValueCountFrequency (%)
] 20490
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 20490
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 503173
67.6%
Common 241006
32.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 51056
 
10.1%
i 45507
 
9.0%
e 43390
 
8.6%
t 39136
 
7.8%
o 33058
 
6.6%
n 32433
 
6.4%
a 30654
 
6.1%
s 29144
 
5.8%
y 28787
 
5.7%
l 25813
 
5.1%
Other values (20) 144195
28.7%
Common
ValueCountFrequency (%)
' 115258
47.8%
47629
19.8%
, 37139
 
15.4%
] 20490
 
8.5%
[ 20490
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 744179
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 115258
15.5%
r 51056
 
6.9%
47629
 
6.4%
i 45507
 
6.1%
e 43390
 
5.8%
t 39136
 
5.3%
, 37139
 
5.0%
o 33058
 
4.4%
n 32433
 
4.4%
a 30654
 
4.1%
Other values (25) 268919
36.1%

num final genres
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.8125427
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 160.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 4
95-th percentile: 5
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.2048063
Coefficient of variation (CV): 0.42836907
Kurtosis: -0.68176067
Mean: 2.8125427
Median Absolute Deviation (MAD): 1
Skewness: 0.13998482
Sum: 57629
Variance: 1.4515582
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values

Value                   Count   Frequency (%)
3                       6005    29.3%
2                       4835    23.6%
4                       4640    22.6%
1                       3463    16.9%
5                       1369    6.7%
6                       170     0.8%
7                       8       < 0.1%

Smallest values

Value                   Count   Frequency (%)
1                       3463    16.9%
2                       4835    23.6%
3                       6005    29.3%
4                       4640    22.6%
5                       1369    6.7%
6                       170     0.8%
7                       8       < 0.1%

Largest values

Value                   Count   Frequency (%)
7                       8       < 0.1%
6                       170     0.8%
5                       1369    6.7%
4                       4640    22.6%
3                       6005    29.3%
2                       4835    23.6%
1                       3463    16.9%
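
Because this variable takes only seven values, its summary statistics can be recomputed directly from the frequency table. The sketch below does so with the counts listed in this section; using the n - 1 denominator for the variance is an assumption about how the report computes it.

```python
# Value -> count pairs copied from the frequency table for "num final genres".
counts = {1: 3463, 2: 4835, 3: 6005, 4: 4640, 5: 1369, 6: 170, 7: 8}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (n - 1 denominator), computed from the grouped counts.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)

print(n, total, round(mean, 7), round(variance, 7))
```

The row count, sum, mean and variance obtained this way agree with the values reported for this variable (20490, 57629, 2.8125427 and 1.4515582).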

Interactions

[Figures: 100 interaction plots rendered with Matplotlib v3.6.3; images not preserved]

Correlations

 | Unnamed: 0 | df_index | total words | vividness | passive voice | all adverbs | ly-adverbs | non-ly-adverbs | year | num final genres
Unnamed: 0 | 1.000 | 1.000 | -0.028 | -0.017 | 0.010 | 0.010 | 0.012 | 0.002 | -0.012 | -0.043
df_index | 1.000 | 1.000 | -0.028 | -0.017 | 0.010 | 0.010 | 0.012 | 0.002 | -0.012 | -0.043
total words | -0.028 | -0.028 | 1.000 | -0.017 | -0.004 | -0.004 | 0.024 | -0.020 | 0.013 | 0.112
vividness | -0.017 | -0.017 | -0.017 | 1.000 | -0.520 | -0.344 | -0.261 | -0.312 | 0.060 | 0.119
passive voice | 0.010 | 0.010 | -0.004 | -0.520 | 1.000 | 0.334 | 0.101 | 0.464 | 0.044 | 0.133
all adverbs | 0.010 | 0.010 | -0.004 | -0.344 | 0.334 | 1.000 | 0.803 | 0.795 | -0.139 | 0.011
ly-adverbs | 0.012 | 0.012 | 0.024 | -0.261 | 0.101 | 0.803 | 1.000 | 0.317 | -0.149 | -0.047
non-ly-adverbs | 0.002 | 0.002 | -0.020 | -0.312 | 0.464 | 0.795 | 0.317 | 1.000 | -0.060 | 0.071
year | -0.012 | -0.012 | 0.013 | 0.060 | 0.044 | -0.139 | -0.149 | -0.060 | 1.000 | 0.003
num final genres | -0.043 | -0.043 | 0.112 | 0.119 | 0.133 | 0.011 | -0.047 | 0.071 | 0.003 | 1.000
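
A correlation matrix like the one above can be scanned for strongly related column pairs with a few lines of pandas. The following is a minimal sketch, not the report's own procedure: the helper name `high_correlation_pairs`, the 0.5 threshold, and the choice of Pearson correlation are assumptions made for illustration, and the report may use a different coefficient. The demo frame reuses five rows of adverb values from the Sample section.

```python
import pandas as pd


def high_correlation_pairs(df, threshold=0.5, method="pearson"):
    """Return (col_a, col_b, r) for numeric column pairs with |r| >= threshold.

    Hypothetical helper; threshold and method are illustrative assumptions.
    """
    corr = df.select_dtypes(include="number").corr(method=method)
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        # Only look at the upper triangle so each pair is reported once.
        for b in cols[i + 1:]:
            r = corr.loc[a, b]
            if abs(r) >= threshold:
                pairs.append((a, b, round(float(r), 3)))
    return pairs


# Values copied from the first five rows of the report's sample.
sample = pd.DataFrame({
    "all adverbs": [1.95, 2.63, 3.72, 3.04, 3.06],
    "ly-adverbs": [0.36, 0.71, 1.64, 1.16, 1.12],
    "non-ly-adverbs": [1.58, 1.92, 2.08, 1.87, 1.93],
})

for a, b, r in high_correlation_pairs(sample):
    print(f"{a} ~ {b}: r = {r}")
```

With only five rows the coefficients are not comparable to the full-dataset values in the table; the point is the pair-extraction pattern, which mirrors how the "High correlation" alerts group columns.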

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
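
The per-column nullity counts behind these plots can be reproduced directly in pandas. This is a minimal sketch under stated assumptions: the helper name `nullity_summary` and the output column names are hypothetical, and the demo frame holds only three rows taken from the Sample section (one of which has a missing `year`), not the full dataset.

```python
import pandas as pd


def nullity_summary(df):
    """Count missing cells per column and express them as a percentage of rows.

    Hypothetical helper; assumes df has at least one row.
    """
    missing = df.isna().sum()
    pct = (100 * missing / len(df)).round(2)
    return pd.DataFrame({"missing": missing, "missing_pct": pct})


# Three rows from the report's sample; "Cage of Glass" has no year.
tail = pd.DataFrame({
    "title": ["True Blue", "Cage of Glass", "The Singing Sands"],
    "year": [2011.0, float("nan"), 1952.0],
})

summary = nullity_summary(tail)
print(summary)
print("total missing cells:", int(summary["missing"].sum()))
```

Summing the `missing` column over the full dataset should correspond to the "Missing cells" figure in the report overview.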

Sample

First rows

(index) | Unnamed: 0 | df_index | title | author | total words | vividness | passive voice | all adverbs | ly-adverbs | non-ly-adverbs | year | genres | num final genres
0 | 0 | 0 | The Vanished Birds | Simon Jimenez | 124205.0 | 55.18 | 6.37 | 1.95 | 0.36 | 1.58 | 2020.0 | ['Science Fiction', 'Fantasy', 'Adult'] | 3
1 | 1 | 1 | The Price of Honor | Jonathan P. Brazee | 77253.0 | 35.35 | 8.71 | 2.63 | 0.71 | 1.92 | 2017.0 | ['Science Fiction'] | 1
2 | 2 | 2 | The Case of the Baker Street Irregulars | Anthony Boucher | 80557.0 | 32.33 | 8.41 | 3.72 | 1.64 | 2.08 | 1940.0 | ['Mystery', 'Crime', 'Classics'] | 3
3 | 3 | 3 | Wildoak | C. C. Harrington | 55602.0 | 74.34 | 6.92 | 3.04 | 1.16 | 1.87 | 2022.0 | ['Historical Fiction', 'Young Adult'] | 2
4 | 4 | 4 | The Holiday | T. M. Logan | 101767.0 | 50.30 | 8.02 | 3.06 | 1.12 | 1.93 | 2019.0 | ['Thriller', 'Mystery', 'Crime', 'Suspense'] | 4
5 | 5 | 5 | Fab: An Intimate Life of Paul McCartney | Howard Sounes | 225785.0 | 39.24 | 6.52 | 3.03 | 1.20 | 1.83 | 2010.0 | ['Biography', 'History'] | 2
6 | 6 | 6 | Honest illusions | Nora Roberts | 163279.0 | 55.45 | 8.31 | 2.72 | 0.95 | 1.77 | 1992.0 | ['Romance', 'Contemporary', 'Mystery', 'Suspense'] | 4
7 | 7 | 7 | Fifth Business | Robertson Davies | 103691.0 | 29.82 | 8.80 | 3.46 | 1.12 | 2.34 | 1970.0 | ['Classics', 'Historical Fiction', 'Literary Fiction'] | 3
8 | 8 | 8 | Spy Above the Clouds | Ciji Ware | 161061.0 | 47.38 | 6.98 | 3.32 | 1.35 | 1.97 | 2021.0 | ['Historical Fiction'] | 1
9 | 9 | 9 | If Then | Jill Lepore | 109677.0 | 27.46 | 5.77 | 2.16 | 0.77 | 1.39 | 2020.0 | ['History'] | 1

Last rows

(index) | Unnamed: 0 | df_index | title | author | total words | vividness | passive voice | all adverbs | ly-adverbs | non-ly-adverbs | year | genres | num final genres
20480 | 20480 | 22122 | Star Wars - The Mandalorian: Junior Novel | Joe Schrieber | 39247.0 | 52.42 | 5.99 | 2.31 | 0.85 | 1.46 | 2022.0 | ['Science Fiction'] | 1
20481 | 20481 | 22124 | The Three Secret Cities | Matthew Reilly | 90879.0 | 59.78 | 5.77 | 2.67 | 1.15 | 1.53 | 2018.0 | ['Thriller', 'Adventure', 'Fantasy', 'Mystery', 'Suspense'] | 5
20482 | 20482 | 22125 | The Love Songs of W. E. B. Du Bois | Honorée Fanonne Jeffers | 260464.0 | 47.46 | 8.67 | 2.43 | 0.54 | 1.90 | 2021.0 | ['Historical Fiction', 'Literary Fiction', 'Historical'] | 3
20483 | 20483 | 22126 | In Veritas - | C. J. Lavigne | 128055.0 | 82.66 | 6.83 | 2.74 | 1.15 | 1.59 | 2020.0 | ['Adult', 'Fantasy', 'Science Fiction'] | 3
20484 | 20484 | 22127 | Star Wars - The New Jedi Order: Vector Prime | R. A. Salvatore | 118450.0 | 42.30 | 5.46 | 3.38 | 1.48 | 1.90 | 1999.0 | ['Fantasy', 'Science Fiction'] | 2
20485 | 20485 | 22128 | Blessing in Disguise | Danielle Steel | 82682.0 | 25.47 | 10.01 | 3.24 | 1.03 | 2.22 | 2019.0 | ['Romance'] | 1
20486 | 20486 | 22256 | Moth | James Sallis | 56267.0 | 52.27 | 7.19 | 3.37 | 1.10 | 2.27 | 1993.0 | ['Mystery', 'Crime', 'Thriller'] | 3
20487 | 20487 | 22361 | True Blue | Jane Smiley | 73746.0 | 50.61 | 8.61 | 3.21 | 0.56 | 2.65 | 2011.0 | ['Young Adult', 'Historical Fiction'] | 2
20488 | 20488 | 22417 | Cage of Glass | Genevieve Crownson | 70143.0 | 49.93 | 7.62 | 3.11 | 1.23 | 1.88 | NaN | ['Young Adult', 'Science Fiction'] | 2
20489 | 20489 | 22428 | The Singing Sands | Josephine Tey | 68986.0 | 40.43 | 9.29 | 2.78 | 0.78 | 2.00 | 1952.0 | ['Mystery', 'Crime', 'Thriller'] | 3
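
The report types `genres` as a text variable, so each cell is a string that looks like a Python list. In the rows shown, `num final genres` equals the number of entries in that list. The sketch below checks this on three sample rows; that the relationship holds for every row is an assumption, and the helper name `count_genres` and the `parsed count` column are hypothetical.

```python
import ast

import pandas as pd


def count_genres(genres_text):
    """Parse a stringified list such as "['Romance']" and return its length."""
    parsed = ast.literal_eval(genres_text)  # safe literal parsing, no eval()
    return len(parsed)


# Three rows copied from the report's sample tables.
rows = pd.DataFrame({
    "title": [
        "The Vanished Birds",
        "The Price of Honor",
        "The Three Secret Cities",
    ],
    "genres": [
        "['Science Fiction', 'Fantasy', 'Adult']",
        "['Science Fiction']",
        "['Thriller', 'Adventure', 'Fantasy', 'Mystery', 'Suspense']",
    ],
    "num final genres": [3, 1, 5],
})

rows["parsed count"] = rows["genres"].map(count_genres)
mismatches = rows[rows["parsed count"] != rows["num final genres"]]
print(rows[["title", "parsed count", "num final genres"]])
print("rows where the counts disagree:", len(mismatches))
```

Running the same check over the full dataset would show whether `num final genres` is purely derived from `genres` or carries independent information.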